In this example, we are going to apply a CNN to classify images of dogs vs. cats. This example will walk you through the fundamentals of importing images, applying image augmentation, and performing classification on them.
Learning objectives:
- Import and prepare image data with image_data_generator() and flow_images_from_directory().
- Build and train a CNN for binary image classification.
- Apply image augmentation (and dropout) to reduce overfitting.
We are going to use the Dogs vs. Cats Kaggle competition data set (https://www.kaggle.com/c/dogs-vs-cats/data). However, due to size and runtime limitations, we are going to use only a subset of the data. We have already set up the directories, which look like:
data
└── dogs-vs-cats
    ├── train
    │   ├── cats
    │   │   ├── cat.1.jpg
    │   │   ├── cat.2.jpg
    │   │   └── ...
    │   └── dogs
    │       ├── dog.1.jpg
    │       ├── dog.2.jpg
    │       └── ...
    ├── validation
    │   ├── cats
    │   └── dogs
    └── test
        ├── cats
        └── dogs
# define the directories:
image_dir <- here::here("docs", "data", "dogs-vs-cats")
train_dir <- file.path(image_dir, "train")
valid_dir <- file.path(image_dir, "validation")
test_dir <- file.path(image_dir, "test")
# create train, validation, and test file paths for cat images
train_cats_dir <- file.path(train_dir, "cats")
valid_cats_dir <- file.path(valid_dir, "cats")
test_cats_dir <- file.path(test_dir, "cats")
# create train, validation, and test file paths for dog images
train_dogs_dir <- file.path(train_dir, "dogs")
valid_dogs_dir <- file.path(valid_dir, "dogs")
test_dogs_dir <- file.path(test_dir, "dogs")
Although there are 25,000 images in this data set, we are going to use a very small subset, which includes:
cat("Cat images:", "\n")
Cat images:
cat(" - total training cat images:", length(list.files(train_cats_dir)), "\n")
- total training cat images: 1000
cat(" - total validation cat images:", length(list.files(valid_cats_dir)), "\n")
- total validation cat images: 500
cat(" - total test cat images:", length(list.files(test_cats_dir)), "\n\n")
- total test cat images: 500
cat("Dog images:", "\n")
Dog images:
cat(" - total training dog images:", length(list.files(train_dogs_dir)), "\n")
- total training dog images: 1000
cat(" - total validation dog images:", length(list.files(valid_dogs_dir)), "\n")
- total validation dog images: 500
cat(" - total test dog images:", length(list.files(test_dogs_dir)), "\n")
- total test dog images: 500
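For reference, if you were starting from the full 25,000-image Kaggle download rather than this pre-built subset, a structure with these counts could be assembled with something along these lines. This is a minimal sketch: original_dir and the specific image index ranges are assumptions, not part of the course materials.
# hypothetical location of the unzipped Kaggle "train" folder
original_dir <- here::here("downloads", "dogs-vs-cats", "train")

# copy a non-overlapping subset of images into a destination directory
copy_subset <- function(animal, idx, dest) {
  fnames <- paste0(animal, ".", idx, ".jpg")
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  file.copy(file.path(original_dir, fnames), file.path(dest, fnames))
}

copy_subset("cat", 1:1000,    train_cats_dir)   # 1,000 training cats
copy_subset("cat", 1001:1500, valid_cats_dir)   #   500 validation cats
copy_subset("cat", 1501:2000, test_cats_dir)    #   500 test cats
copy_subset("dog", 1:1000,    train_dogs_dir)   # 1,000 training dogs
copy_subset("dog", 1001:1500, valid_dogs_dir)   #   500 validation dogs
copy_subset("dog", 1501:2000, test_dogs_dir)    #   500 test dogs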
Let’s check out the first 10 cat and dog images:
op <- par(mfrow = c(4, 5), pty = "s", mar = c(0.1, 0.1, 0.1, 0.1))
for (i in 1:10) {
plot(as.raster(jpeg::readJPEG(paste0(train_cats_dir, "/cat.", i, ".jpg"))))
plot(as.raster(jpeg::readJPEG(paste0(train_dogs_dir, "/dog.", i, ".jpg"))))
}
par(op)
We’re going to set up a simple CNN model that contains steps you saw in the previous module. This CNN includes:
- four convolutional layers (32, 64, 128, and 128 filters, each with a 3x3 kernel and ReLU activation), each followed by 2x2 max pooling,
- a flatten layer,
- a densely connected layer with 512 units and ReLU activation, and
- a single-unit sigmoid output layer for binary classification.
model <- keras_model_sequential() %>%
layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
input_shape = c(150, 150, 3)) %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_flatten() %>%
layer_dense(units = 512, activation = "relu") %>%
layer_dense(units = 1, activation = "sigmoid")
summary(model)
Model: "sequential_1"
______________________________________________________________________________
Layer (type) Output Shape Param #
==============================================================================
conv2d (Conv2D) (None, 148, 148, 32) 896
______________________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32) 0
______________________________________________________________________________
conv2d_1 (Conv2D) (None, 72, 72, 64) 18496
______________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 36, 36, 64) 0
______________________________________________________________________________
conv2d_2 (Conv2D) (None, 34, 34, 128) 73856
______________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 17, 17, 128) 0
______________________________________________________________________________
conv2d_3 (Conv2D) (None, 15, 15, 128) 147584
______________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 7, 7, 128) 0
______________________________________________________________________________
flatten (Flatten) (None, 6272) 0
______________________________________________________________________________
dense_2 (Dense) (None, 512) 3211776
______________________________________________________________________________
dense_3 (Dense) (None, 1) 513
==============================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
______________________________________________________________________________
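As a quick sanity check, the parameter counts in the summary follow directly from the layer shapes; for example (simple arithmetic, nothing model-specific is assumed here):
# first conv layer: 3x3 kernel over 3 input channels, 32 filters, plus 32 biases
3 * 3 * 3 * 32 + 32    # 896
# second conv layer: 3x3 kernel over 32 channels, 64 filters, plus 64 biases
3 * 3 * 32 * 64 + 64   # 18496
# first dense layer: 6,272 flattened inputs x 512 units, plus 512 biases
6272 * 512 + 512       # 3211776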
Compile the model:
model %>% compile(
loss = "binary_crossentropy",
optimizer = optimizer_rmsprop(lr = 1e-4),
metrics = "accuracy"
)
image_data_generator will define how the images are preprocessed as they flow through. Here it simply:
- converts the decoded images to floating-point tensors, and
- rescales the pixel values from the [0, 255] range to the [0, 1] interval (rescale = 1/255).
image_data_generator provides other capabilities that we’ll look at shortly.
flow_images_from_directory will:
- read the image files from the supplied directory and decode the JPEG content into RGB grids of pixels,
- resize every image to 150 x 150 pixels,
- infer the binary class labels from the subdirectory names (cats, dogs), and
- yield the images and labels in batches of 20.
train_datagen <- image_data_generator(rescale = 1/255)
valid_datagen <- image_data_generator(rescale = 1/255)
train_generator <- flow_images_from_directory(
train_dir,
train_datagen,
target_size = c(150, 150),
batch_size = 20,
class_mode = "binary"
)
validation_generator <- flow_images_from_directory(
valid_dir,
valid_datagen,
target_size = c(150, 150),
batch_size = 20,
class_mode = "binary",
seed = 123L
)
If we get the first batch from the generator, we see that it yields 20 images of 150 x 150 pixels with three channels (shape 20, 150, 150, 3), along with their binary labels (0/1).
batch <- generator_next(train_generator)
str(batch)
List of 2
$ : num [1:20, 1:150, 1:150, 1:3] 0.161 0.122 0.486 0.482 0.882 ...
$ : num [1:20(1d)] 1 0 1 1 0 1 1 0 0 1 ...
To train our model we’ll use fit_generator, which is the equivalent of fit for data generators. We provide it our generators for the training and validation data. Plus, we need to specify:
- steps_per_epoch: how many batches to draw from the training generator before declaring an epoch over. Our generator supplies batches of 20 and we have 2,000 training images, so we need 100 steps.
- validation_steps: how many batches to draw from the validation generator. Our generator supplies batches of 20 and we have 1,000 validation images, so we need 50 steps.
Without a GPU, this will take approximately 20 minutes to train.
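The step counts are just the image counts divided by the batch size (rounded up if they don't divide evenly); a quick check:
batch_size <- 20
ceiling(2000 / batch_size)  # steps_per_epoch:  100
ceiling(1000 / batch_size)  # validation_steps: 50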
history <- model %>% fit_generator(
train_generator,
steps_per_epoch = 100,
epochs = 30,
validation_data = validation_generator,
validation_steps = 50
)
View the training history:
plot(history)
model %>% save_model_hdf5("cats_and_dogs_small_1.h5")
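If we want to come back to this model later, it can be reloaded from disk; load_model_hdf5() is the counterpart of save_model_hdf5() in this version of the keras package.
# reload the saved model (architecture, weights, and training configuration)
model <- load_model_hdf5("cats_and_dogs_small_1.h5")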
Our model above does OK, but it definitely has room for improvement. One approach to improving performance is to collect more data. Unfortunately, this is not always an option. An alternative is to use data augmentation.
datagen <- image_data_generator(
rescale = 1/255,
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = TRUE,
fill_mode = "nearest"
)
The following helps to visualize the idea of image augmentation by:
- taking the first cat image from the training directory,
- reshaping it into the (1, 150, 150, 3) tensor format the generator expects, and
- passing it repeatedly through the augmentation generator and plotting ten randomly transformed versions.
fnames <- list.files(train_cats_dir, full.names = TRUE)
img_path <- fnames[[1]]
img <- image_load(img_path, target_size = c(150, 150))
img_array <- image_to_array(img)
img_array <- array_reshape(img_array, c(1, 150, 150, 3))
augmentation_generator <- flow_images_from_data(
img_array,
generator = datagen,
batch_size = 1
)
op <- par(mfrow = c(2, 5), pty = "s", mar = c(0, 0.1, 0, 0.1))
for (i in 1:10) {
batch <- generator_next(augmentation_generator)
plot(as.raster(batch[1,,,]))
}
par(op)
Let’s create a new model that includes image augmentation, and we’ll apply the dropout regularization method. The following creates a CNN architecture with:
- the same four convolution + max-pooling blocks as before,
- a dropout layer (rate = 0.5) after flattening, to regularize the densely connected classifier, and
- the same 512-unit dense layer and single-unit sigmoid output.
model <- keras_model_sequential() %>%
layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu", input_shape = c(150, 150, 3)) %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = "relu") %>%
layer_max_pooling_2d(pool_size = c(2, 2)) %>%
layer_flatten() %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units = 512, activation = "relu") %>%
layer_dense(units = 1, activation = "sigmoid")
model %>% compile(
loss = "binary_crossentropy",
optimizer = optimizer_rmsprop(lr = 1e-4),
metrics = "accuracy"
)
Now we can add image augmentation to our image_data_generator(). The rest of the inputs remain the same; however, we use a larger batch size (32), and we do not need to worry about batch_size x steps_per_epoch equaling the number of training images since we are doing image augmentation.
# only augment training data
train_datagen <- image_data_generator(
rescale = 1/255,
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = TRUE
)
# do not augment test and validation data
test_datagen <- image_data_generator(rescale = 1/255)
# generate batches of data from training directory
train_generator <- flow_images_from_directory(
train_dir,
train_datagen,
target_size = c(150, 150),
batch_size = 32,
class_mode = "binary"
)
# generate batches of data from validation directory
validation_generator <- flow_images_from_directory(
valid_dir,
test_datagen,
target_size = c(150, 150),
batch_size = 32,
class_mode = "binary"
)
# train model
history <- model %>%
fit_generator(
train_generator,
steps_per_epoch = 100,
epochs = 100,
validation_data = validation_generator,
validation_steps = 50
)
model %>% save_model_hdf5("cats_and_dogs_small_2.h5")
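Though the walkthrough stops here, the held-out test directory we defined earlier could be used to estimate generalization performance with something like the following sketch. It assumes the same 150 x 150 target size; evaluate_generator() pairs with fit_generator() in this version of the keras package, and 50 steps covers the 1,000 test images at a batch size of 20.
# generate (non-augmented) batches of data from the test directory
test_generator <- flow_images_from_directory(
  test_dir,
  test_datagen,
  target_size = c(150, 150),
  batch_size = 20,
  class_mode = "binary"
)

# evaluate loss and accuracy on the test set
model %>% evaluate_generator(test_generator, steps = 50)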